Abstract

Phonology is the systematic study of the sounds used in language, their internal structure, and their 
composition into syllables, words and phrases. Computational phonology is the application of formal and 
computational techniques to the representation and processing of phonological information. This chapter 
will present the fundamentals of descriptive phonology along with a brief overview of computational 
phonology. 

1 Phonological contrast, the phoneme, and distinctive features 

There is no limit to the number of distinct sounds that can be produced by the human vocal apparatus. 
However, this infinite variety is harnessed by human languages into sound systems consisting of a few 
dozen language-specific categories, or phonemes. An example of an English phoneme is t. English has a
variety of t-like sounds, such as the aspirated tʰ of ten, the unreleased t̚ of net, and the flapped ɾ of water
(in some dialects). In English, these distinctions are not used to differentiate words, and so we do not
find pairs of English words which are identical but for their use of tʰ versus t̚. (By comparison, in some
other languages, such as Icelandic and Bengali, aspiration is contrastive.) Nevertheless, since these sounds 
(or phones, or segments) are phonetically similar, and since they occur in complementary distribution 
(i.e. disjoint contexts) and cannot differentiate words in English, they are all said to be allophones of the 
English phoneme t. 

Of course, setting up a few allophonic variants for each of a finite set of phonemes does not account 
for the infinite variety of sounds mentioned above. If one were to record multiple instances of the same 
utterance by a single speaker, many small variations could be observed in loudness, pitch, rate, vowel
quality, and so on. These variations arise because speech is a motor activity involving coordination of 
many independent articulators, and perfect repetition of any utterance is simply impossible. Similar varia- 
tions occur between different speakers, since one person's vocal apparatus is different to the next person's 
(and this is how we can distinguish people's voices). So 10 people saying ten 10 times each will produce 
100 distinct acoustic records for the t sound. This diversity of tokens associated with a single type is
sometimes referred to as free variation. 

Above, the notion of phonetic similarity was used. The primary way to judge the similarity of phones 
is in terms of their place and manner of articulation. The consonant chart of the International Phonetic 
Alphabet (IPA) tabulates phones in this way, as shown in Figure 1. The IPA provides symbols for all
sounds that are contrastive in at least one language. 

The major axes of this chart are for place of articulation (horizontal), which is the location in the 
oral cavity of the primary constriction, and manner of articulation (vertical), the nature and degree of that 
constriction. Many cells of the chart contain two consonants, one voiced and the other unvoiced. These 
complementary properties are usually expressed as opposite values of a binary feature [±voiced].

A more elaborate model of the similarity of phones is provided by the theory of distinctive features. 
Two phones are considered more similar to the extent that they agree on the value of their features. A set 
of distinctive features and their values for five different phones is shown in (1). (Note that many of the
features have an extended technical definition, for which it is necessary to consult a textbook.) 
[Figure: consonant chart, "The International Phonetic Alphabet (revised to 1993), Consonants (Pulmonic)". Columns (place of articulation): Bilabial, Labiodental, Dental, Alveolar, Postalveolar, Retroflex, Palatal, Velar, Uvular, Pharyngeal, Glottal. Rows (manner of articulation): Plosive, Nasal, Trill, Tap or Flap, Fricative, Lateral fricative, Approximant, Lateral approximant. Where symbols appear in pairs, the one to the right represents a voiced consonant. Shaded areas denote articulations judged impossible.]

Figure 1: Pulmonic Consonants from the International Phonetic Alphabet

(1)                 t   z   m   l   i

     anterior       +   +   +   +   -
     coronal        +   +   -   +   -
     labial         -   -   +   -   -
     distributed    -   -   -   -   -
     consonantal    +   +   +   +   -
     sonorant       -   -   +   +   +
     voiced         -   +   +   +   +
     approximant    -   -   -   +   +
     continuant     -   +   -   +   +
     lateral        -   -   -   +   -
     nasal          -   -   +   -   -
     strident       -   +   -   -   -

Statements about the distribution of phonological information, usually expressed with rules or constraints,
often apply to particular subsets of phones. Instead of listing these sets, it is virtually always simpler
to list two or three feature values which pick out the required set. For example [+labial, -continuant]
picks out b, p, and m, shown in the top left corner of Figure 1. Sets of phones which can be picked out
in this way are called natural classes, and phonological analyses can be evaluated in terms of their reliance
on natural classes. How can we express these analyses? The rest of this chapter discusses some key
approaches to this question.
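
The idea of picking out a natural class by feature values can be sketched in a few lines of Python. This is only an illustration: it uses the five phones and the feature values of chart (1), the names CHART, natural_class and agreement are invented here, and the nasal row (blank in the chart as extracted) is filled in on the assumption that m is the only nasal among the five.

```python
# Feature values for the five phones of chart (1), one string per feature,
# with one "+" or "-" per phone in the order t, z, m, l, i.
PHONES = ["t", "z", "m", "l", "i"]
CHART = {
    "anterior":    "++++-",
    "coronal":     "++-+-",
    "labial":      "--+--",
    "distributed": "-----",
    "consonantal": "++++-",
    "sonorant":    "--+++",
    "voiced":      "-++++",
    "approximant": "---++",
    "continuant":  "-+-++",
    "lateral":     "---+-",
    "nasal":       "--+--",   # assumed values: only m is nasal
    "strident":    "-+---",
}

# Per-phone view of the same chart: FEATURES["m"]["labial"] is "+".
FEATURES = {
    phone: {feat: values[k] for feat, values in CHART.items()}
    for k, phone in enumerate(PHONES)
}


def natural_class(**spec):
    """Return the phones that match every feature value given in spec."""
    return [p for p in PHONES
            if all(FEATURES[p][feat] == val for feat, val in spec.items())]


def agreement(a, b):
    """Count the features on which phones a and b have the same value."""
    return sum(FEATURES[a][feat] == FEATURES[b][feat] for feat in CHART)


print(natural_class(labial="+", continuant="-"))
print(natural_class(sonorant="+"))
print(agreement("t", "z"), agreement("t", "i"))
```

With only five phones in the table, [+labial, -continuant] selects just m; over a full inventory the same specification would also select p and b. The agreement count is one crude way to make "more similar" concrete, not a standard similarity metric.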

Unfortunately, as with any introductory chapter like this one, it will not be possible to cover many 
important topics of interest to phonologists, such as acquisition, diachrony, orthography, universals, sign
language phonology, the phonology/syntax interface, systems of intonation and stress, and many others 
besides. However, numerous bibliographic references are supplied at the end of the chapter, and readers 
may wish to consult these other works. 
2 Early Generative Phonology



Some key concepts of phonology are best introduced by way of simple examples involving real data. We 
begin with some data from Russian in (2). The example shows some nouns, in nominative and dative
cases, transcribed using the International Phonetic Alphabet. Note that x is the symbol for a voiceless 
velar fricative (e.g. the ch of Scottish loch). 

(2)   Nominative   Dative   Gloss

      xlep         xlebu    'bread'
      grop         grobu    'coffin'
      sat          sadu     'garden'
      prut         prudu    'pond'
      rok          rogu     'horn'
      ras          razu     'time'

Observe that the dative form involves suffixation of -u, and a change to the final consonant of the
nominative form. In (2) we see four changes: p becomes b, t becomes d, k becomes g, and s becomes z.

The members of each pair differ only in their voicing; for example, b is a voiced version of p, since b involves periodic
vibration of the vocal folds, while p does not. The same applies to the other pairs of sounds. Now we
see that the changes we observed in (2) are actually quite systematic. Such systematic patterns are called
alternations, and this particular one is known as a voicing alternation. We can formulate this alternation
using a phonological rule as follows:

(3)   [ C, -voiced ]  →  [+voiced]  /  ___ V

      A consonant becomes voiced in the presence of a following vowel

Rule (3) uses the format of early generative phonology. In this notation, C represents any consonant
and V represents any vowel. The rule says that, if a voiceless consonant appears in the phonological
environment ___ V (i.e. preceding a vowel), then the consonant becomes voiced. By default, vowels
have the feature [+voiced], and so we can make the observation that the consonant assimilates the voicing
feature of the following vowel.

One way to see if our analysis generalises is to check for any nominative forms that end in a voiced
consonant. We expect this consonant to stay the same in the dative form. However, it turns out that we do
not find any nominative forms ending in a voiced consonant. Rather, we see the pattern in example (4).
(Note that č is an alternative symbol for IPA tʃ.)

(4)   Nominative   Dative    Gloss

      čerep        čerepu    'skull'
      xolop        xolopu    'bondman'
      trup         trupu     'corpse'
      cvet         cvetu     'colour'
      les          lesu      'forest'
      porok        poroku    'vice'

For these words, the voiceless consonants of the nominative form are unchanged in the dative form,
contrary to our rule (3). These cannot be treated as exceptions, since this second pattern is quite pervasive.
A solution is to construct an artificial form which is the dative wordform minus the -u suffix. We will call
this the underlying form of the word. Example (5) illustrates this for two cases:

(5)   Underlying   Nominative   Dative   Gloss

      prud         prut         prudu    'pond'
      cvet         cvet         cvetu    'colour'

Now we can account for the dative form simply by suffixing the -u. We account for the nominative 
form with the following devoicing rule: 
(6)   [ C, +voiced ]  →  [-voiced]  /  ___ #

      A consonant becomes devoiced word-finally

This rule states that a voiced consonant is devoiced (i.e. [+voiced] becomes [-voiced]) if the consonant
is followed by a word boundary (symbolised by #). It solves a problem with rule (3), which only accounts
for half of the data. Rule (6) is called a neutralisation rule, because the voicing contrast of the underlying
form is removed in the nominative form. Now the analysis accounts for all the nominative and dative
forms. Typically, rules like (6) can simultaneously employ several of the distinctive features from (1).
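
The analysis just described (underlying forms plus a word-final devoicing rule) can be sketched as follows. This is a minimal illustration, assuming that the only voiced/voiceless pairs that matter are the four seen in the data; the function names and the DEVOICE table are invented for the example.

```python
# Voiced consonants and their voiceless counterparts, limited to the
# four pairs that occur in the Russian data (an assumption of this sketch).
DEVOICE = {"b": "p", "d": "t", "g": "k", "z": "s"}


def nominative(underlying):
    """Apply word-final devoicing to an underlying form."""
    final = underlying[-1]
    return underlying[:-1] + DEVOICE.get(final, final)


def dative(underlying):
    """The dative is the underlying form plus the suffix -u; no rule applies."""
    return underlying + "u"


for form in ["xleb", "grob", "sad", "prud", "rog", "raz", "trup", "cvet", "les", "porok"]:
    print(form, nominative(form), dative(form))
```

Because the underlying form keeps the voicing contrast, prud and cvet yield prut/prudu and cvet/cvetu from the same two functions, with no need for a voicing rule before vowels.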

Observe that our analysis involves a certain degree of abstractness. We have constructed a new level 
of representation and drawn inferences about the underlying forms by inspecting the observed surface 
forms. 

To conclude the development so far, we have seen a simple kind of phonological representation
(namely sequences of alphabetic symbols, where each stands for a bundle of distinctive features), a distinction
between levels of representation, and rules which account for the relationship between the representations
on various levels. One way or another, most of phonology is concerned with these three
things: representations, levels, and rules.

Finally, let us consider the plural forms shown in example (7). The plural morpheme is either -a or -y.

(7)   Singular   Plural    Gloss

      xlep       xleba     'bread'
      grop       groby     'coffin'
      čerep      čerepa    'skull'
      xolop      xolopy    'bondman'
      trup       trupy     'corpse'
      sat        sady      'garden'
      prut       prudy     'pond'
      cvet       cveta     'colour'
      ras        razy      'time'
      les        lesa      'forest'
      rok        roga      'horn'
      porok      poroky    'vice'


The phonological environment of the suffix provides us with no way of predicting which allomorph is 
chosen. One solution would be to enrich the underlying form once more (for example, we could include 
the plural suffix in the underlying form, and then have rules to delete it in all cases but the plural). A 
better approach in this case is to distinguish two morphological classes, one for nouns taking the -y 
plural, and one for nouns taking the -a plural. This information would then be an idiosyncratic property 
of each lexical item, and a morphological rule would be responsible for the choice between the -y and 
-a allomorphs. A full account of this data, then, must involve phonological, morphological and lexical 
modules of a grammar.

As another example, let us consider the vowels of Turkish. These vowels are tabulated below, along 
with a decomposition into distinctive features: [high], [back] and [round]. The features [high] and [back] 
relate to the position of the tongue body in the oral cavity. The feature [round] relates to the rounding of 
the lips, as in the English w sound.¹



(8)            u   o   ü   ö   ı   a   i   e

      high     +   -   +   -   +   -   +   -
      back     +   +   -   -   +   +   -   -
      round    +   +   +   +   -   -   -   -

¹ Note that there is a distinction made in the Turkish alphabet between the dotted i and the dotless ı. This ı is a high, back,
unrounded vowel that does not occur in English.
Consider the following Turkish words, paying particular attention to the four versions of the possessive
suffix. Note that similar data are discussed in chapter 2. 



(9)   ip     'rope'       ipin     'rope's'
      kız    'girl'       kızın    'girl's'
      yüz    'face'       yüzün    'face's'
      pul    'stamp'      pulun    'stamp's'
      el     'hand'       elin     'hand's'
      çan    'bell'       çanın    'bell's'
      köy    'village'    köyün    'village's'
      son    'end'        sonun    'end's'


The possessive suffix has the forms in, ın, ün and un. In terms of the distinctive feature chart in (8), we
can observe that the suffix vowel is always [+high]. The other features of the suffix vowel are copied from
the stem vowel. This copying is called vowel harmony. Let us see how this behaviour can be expressed
using a phonological rule. To do this, we assume that the vowel of the possessive affix is only specified as
[+high] and is underspecified for its other features. In the following rule, C denotes any consonant, and
the Greek letter variables range over the + and - values of the feature.

(10)  [ V, +high ]  →  [ αback, βround ]  /  [ αback, βround ] C* ___

      A high vowel assimilates to the backness and rounding of the preceding vowel

So long as the stem vowel is specified for the properties [back] and [round], this rule will make sure that
they are copied onto the affix vowel. However, there is nothing in the rule formalism to stop the variables
being used in inappropriate ways (e.g. αback, αround). So we can see that the rule formalism does
not permit us to express the notion that certain features are shared by more than one segment. Instead,
we would like to be able to represent the sharing explicitly, as follows, where ±H abbreviates [±high], an
underspecified vowel position:

(11)   ç  -H  n  +H  n          k  -H  y  +H  n
           \        /               \        /
         [+back, -round]          [-back, +round]

The lines of this diagram indicate that the backness and roundness properties are shared by both vowels 
in a word. A single vowel property (or type) is manifested on two separate vowels (tokens). 
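
The harmony pattern can be sketched in Python by treating the suffix vowel as [+high] only and filling in backness and rounding from the last stem vowel. This is an illustrative sketch rather than an implementation of rule (10) or of the shared-feature diagram: the VOWELS table restates the values of chart (8), and the names HIGH_VOWEL and possessive are invented here.

```python
# (high, back, round) values for the eight Turkish vowels, as in chart (8).
VOWELS = {
    "u": ("+", "+", "+"),
    "o": ("-", "+", "+"),
    "ü": ("+", "-", "+"),
    "ö": ("-", "-", "+"),
    "ı": ("+", "+", "-"),
    "a": ("-", "+", "-"),
    "i": ("+", "-", "-"),
    "e": ("-", "-", "-"),
}

# Index the four high vowels by their (back, round) values.
HIGH_VOWEL = {(back, rnd): v
              for v, (high, back, rnd) in VOWELS.items() if high == "+"}


def possessive(stem):
    """Attach the possessive suffix, copying [back] and [round] from the
    last stem vowel onto the underspecified [+high] suffix vowel."""
    last_vowel = [ch for ch in stem if ch in VOWELS][-1]
    _, back, rnd = VOWELS[last_vowel]
    return stem + HIGH_VOWEL[(back, rnd)] + "n"


for stem in ["ip", "kız", "yüz", "pul", "el", "çan", "köy", "son"]:
    print(stem, possessive(stem))
```

The sketch assumes every stem contains at least one vowel; a vowelless stem would raise an IndexError.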

Entities like [+back,-round] that function over extended regions are often referred to as prosodies, 
and this kind of picture is sometimes called a non-linear representation. Many phonological models use
non-linear representations of one sort or another. Here we shall consider one particular model, namely
autosegmental phonology, since it is the most widely used non-linear model. The term comes from 
'autonomous + segment', and refers to the autonomous nature of segments (or certain groups of features) 
once they have been liberated from one-dimensional strings. 



3 Autosegmental Phonology 

In autosegmental phonology, diagrams like those we saw above are known as charts. A chart consists of 
two or more tiers, along with some association lines drawn between the autosegments on those tiers. The 
no-crossing constraint is a stipulation that association lines are not allowed to cross, ensuring that association
lines can be interpreted as asserting some kind of temporal overlap or inclusion. Autosegmental
rules are procedures for converting one representation into another, by adding or removing association 
lines and autosegments. A rule for Turkish vowel harmony is shown below on the left in (12), where V 
denotes any vowel, and the dashed line indicates that a new association is created. This rule applies to the 
representation in the middle, to yield the one on the right. 



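(12)  [Three autosegmental diagrams. Left: the rule, V C* V, with an autosegment linked to the first V by a solid line and to the second V by a dashed line. Middle: ç -H n +H n, with [+back, -round] linked only to the first vowel. Right: ç -H n +H n, with [+back, -round] linked to both vowels.]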



In order to fully appreciate the power of autosegmental phonology, we will use it to analyse some data 
from an African tone language. Consider the data in Table 1. Twelve nouns are listed down the left side,
and the isolation form and five contextual forms are provided across the table. The line segments indicate
voice pitch (the fundamental frequency of the voice); dotted lines are for the syllables of the context words,
and full lines are for the syllables of the target word, as it is pronounced in this context. At first glance this
data seems bewildering in its complexity. However, we will see how autosegmental analysis reveals the 
simple underlying structure of the data. 

Looking across the table, observe that the contextual forms of a given noun are quite variable: bulali,
for example, appears with several different pitch patterns.

We could begin the analysis by identifying all the levels (here there are five), assigning a name or
number to each, and looking for patterns. However, this approach does not capture the relative nature
of tone, where the same melody realized at two different overall pitch heights is not distinguished. Instead, our approach just has to be sensitive to
differences between adjacent tones. So these distinct tone sequences could be represented identically as
+1, -2, since we go up a small amount from the first to the second tone (+1), and then down a larger
amount (-2). In autosegmental analysis, we treat contour tones as being made up of two or more level
tones compressed into the space of a single syllable. Therefore, a form in which one of these steps takes place within a single syllable can be treated as another instance of
+1, -2. Given our autosegmental perspective, a sequence of two or more identical tones corresponds to
a single spread tone. This means that we can collapse sequences of like tones to a single tone.² When we
retranscribe our data in this way, some interesting patterns emerge.
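
The retranscription step can be sketched as follows. The sketch assumes that each pitch trace has already been coded by hand as a list of integer levels (higher number, higher pitch), with a contour tone written as its successive levels; that coding and the function names are inventions of this example, not part of the original analysis.

```python
def collapse(levels):
    """Collapse runs of identical adjacent levels into a single level."""
    collapsed = []
    for level in levels:
        if not collapsed or collapsed[-1] != level:
            collapsed.append(level)
    return collapsed


def intervals(levels):
    """Return the signed differences between adjacent (collapsed) levels."""
    tones = collapse(levels)
    return [later - earlier for earlier, later in zip(tones, tones[1:])]


# The same melody at two different overall heights gives the same intervals.
print(intervals([3, 4, 2]))
print(intervals([2, 3, 1]))
# A repeated level is treated as one spread tone.
print(intervals([2, 3, 3, 1]))
```

Working with intervals rather than absolute levels is what makes the two transcriptions of the same melody come out identical.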

First, by observing the raw frequency of these intertone intervals, we see that -2 and +1 are by far
the most common, occurring 63 and 39 times respectively. A -1 difference occurs 8 times, while a +2
difference is very rare (only occurring 3 times, and only in phrase-final contour tones). This patterning is
characteristic of a terrace tone language. In analysing such a language, phonologists typically propose
an inventory of just two tones, H (high) and L (low), where these might be represented featurally as [±hi].
In such a model, the tone sequence HL corresponds to a high pitch followed by a low pitch, a pitch difference of -2.

² This assumption cannot be maintained in more sophisticated approaches involving lexical and prosodic domains. However, it is
a very useful simplifying assumption for the purposes of this presentation. 
[Table: pitch traces for twelve nouns (rows) in six contexts (columns); the traces themselves are not reproduced here.]

Contexts: A. isolation; B. i 'his ...'; C. am goro 'your (pl) brother's ...'; D. ku 'one ...'; E. am ... wo do 'your (pl) ... is there'; F. jiine ... ni 'that ...'

Nouns: 1. baka 'tree'; 2. saka 'comb'; 3. buri 'duck'; 4. siri 'goat'; 5. gado 'bed'; 6. goro 'brother'; 7. ca 'dog'; 8. ni 'mother'; 9. jokoro 'chain'; 10. tokoro 'window'; 11. bulali 'iron'; 12. inisini 'needle'

Table 1: Tone Data from Chakosi (Ghana)



In terrace tone languages, an H tone does not achieve its former level after an L tone, so in HLH the
second H is phonetically realized at a lower pitch than the first, instead of returning to the same level.
This kind of H-lowering is called automatic downstep.
A pitch difference of +1 corresponds to an LH tone sequence. With this model, we already account for
the prevalence of the -2 and +1 intervals. What about -1 and +2?

As we will see later, the -1 difference arises when the middle tone of an HLH sequence is deleted, leaving
just a high tone followed by a slightly lowered high tone. In this situation we write H!H, where the exclamation
mark indicates the lowering of the following
H due to a deleted (or floating) low tone. This kind of H-lowering is called conditioned downstep.
The rare +2 difference only occurs for an LH contour; we can assume that automatic downstep only applies
when an LH sequence is linked to two separate syllables and not when the sequence is linked to
a single syllable.

To summarise these conventions, we associate the pitch differences to tone sequences as shown in
(13). Syllable boundaries are marked with a dot.

(13)   Interval   -2     -1      +1     +2
       Tones      H.L    H.!H    L.H    LH

Now we are in a position to provide tonal transcriptions for the forms in Table 1. Example (14) gives
the transcriptions for the forms involving bulali. Tones corresponding to the noun are enclosed in square brackets.

(14) Transcriptions of bulali 'iron'

     bulali            'iron'                        [L.H.L]
     i bulali          'his iron'                    H.[H.!H.L]
     am goro bulali    'your (pl) brother's iron'    HL.L.L.[L.H.L]
     bulali ku         'one iron'                    [L.H.H].L
     am bulali wo do   'your (pl) iron is there'     HL.[L.H.H].!H.L
     jiine bulali ni   'that iron'                   L.H.[H.!H.H].L

Looking down the right hand column of (14) at the bracketed tones, observe again the diversity of
surface forms corresponding to the single lexical item. An autosegmental analysis is able to account for
all this variation with a single spreading rule.

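(15) High Tone Spread

     [Autosegmental diagram: three syllables σ σ σ. H is linked to the first σ and L to the second σ; a dashed line joins H to the second σ, and the link between L and the second σ is struck through.]

     A high tone spreads to the following (non-final) syllable, delinking the low tone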

Rule (15) applies to any sequence of three syllables (σ) where the first is linked to an H tone and the
second is linked to an L tone. The rule spreads H to the right, delinking the L. Crucially, the L itself is
not deleted, but remains as a floating tone, and continues to influence surface tone as downstep. Example
(16) shows the application of the H spread rule to forms involving bulali. The first row of autosegmental
diagrams shows the underlying forms, where bulali is assigned an LHL tone melody. In the second row, 
we see the result of applying H spread. Following standard practice, the floating low tones are circled. 
Where a floating L appears between two H tones, it gives rise to downstep. The final assignment of tones 
to syllables and the position of the downsteps are shown in the last row of the table. 
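
The spreading rule and the resulting downstep can be sketched procedurally. This is a simplified illustration, not the autosegmental formalism itself: tones are given as one string per syllable (with "HL" for a contour on one syllable), the rule is applied to all qualifying positions of the underlying form at once (an assumption of this sketch), and the function names are invented.

```python
def high_tone_spread(tones):
    """Apply H spread to a list of per-syllable tones.

    Returns the relinked tones and a list of flags; flag i is True when a
    delinked (floating) L sits immediately after syllable i.
    """
    n = len(tones)
    linked = list(tones)
    floating_after = [False] * n
    # Syllable i + 1 must be non-final, so i stops at n - 3.
    for i in range(n - 2):
        if tones[i].endswith("H") and tones[i + 1] == "L":
            linked[i + 1] = "H"            # H spreads rightwards
            floating_after[i + 1] = True   # the L is delinked but not deleted
    return linked, floating_after


def surface(tones):
    """Return the surface transcription, marking downstep with '!'."""
    linked, floating_after = high_tone_spread(tones)
    out = []
    for i, tone in enumerate(linked):
        # A floating L directly before an H lowers that H.
        downstep = i > 0 and floating_after[i - 1] and tone.startswith("H")
        out.append(("!" if downstep else "") + tone)
    return ".".join(out)


print(surface(["H", "L", "H", "L"]))                # i bulali
print(surface(["L", "H", "L", "L"]))                # bulali ku
print(surface(["HL", "L", "H", "L", "H", "L"]))     # am bulali wo do
print(surface(["L", "H", "L", "H", "L", "L"]))      # jiine bulali ni
```

The four inputs are the underlying tone melodies of the forms in (16), with bulali as LHL; the isolation form is unaffected because its L-toned syllable is final.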



(16)                     B. 'his iron'    D. 'one iron'    E. 'your (pl) iron'     F. 'that iron'

     Syllables           i bu la li       bu la li ku      am bu la li wo do       jii ni bu la li ni
     Underlying tones    H L H L          L H L L          HL L H L H L            L H L H L L
     After H spread      H (L) H L        L H (L) L        HL L H (L) H L          L H (L) H (L) L
     Surface tones       H H !H L         L H H L          HL L H H !H L           L H H !H H L

     [(L) stands for a floating low tone, circled in the original diagrams; association lines are not shown.]



Example (16) shows the power of autosegmental phonology - together with suitable underlying forms
and appropriate principles of phonetic interpretation - in analysing complex patterns with simple rules.
Space precludes a full analysis of the data; interested readers can try hypothesising underlying forms for
the other words, along with new rules, to account for the rest of the data in Table 1.

The preceding discussion of segmental and autosegmental phonology highlights the multi-linear or- 
ganisation of phonological representations, which derives from the temporal nature of the speech stream. 
Phonological representations are also organised hierarchically. We already know that segments make up
words, and words make up phrases. This is one kind of hierarchical organisation of phonological
information. But phonological analysis has also demonstrated the need for other kinds of hierarchy, such 
as the prosodic hierarchy, which builds structure involving syllables, feet and intonational phrases above 
the segment level, and feature geometry, which involves hierarchical organisation beneath the level of 
the segment. Phonological rules and constraints can refer to the prosodic hierarchy in order to account 
for the observed distribution of phonological information across the linear sequence of segments. Fea- 
ture geometry serves the dual purpose of accounting for the inventory of contrastive sounds available to a 
language, and for the alternations we can observe. Here we will consider just one level of phonological 
hierarchy, namely the syllable. 
4 Syllable Structure



Syllables are a fundamental organisational unit in phonology. In many languages, phonological alternations
are sensitive to syllable structure. For example, t has several allophones in English, and the choice
of allophone depends on phonological context. In many English dialects, t is pronounced
as the flap [ɾ] between vowels, as in water. Two other variants are shown in (17), where the phonetic
transcription is given in brackets, and syllable boundaries are marked with a dot.



(17)   atlas    [æt̚.ləs]
       cactus   [kæk.tʰəs]



Native English syllables cannot begin with tl, and so the t of atlas is syllabified with the preceding
vowel. Syllable-final t is regularly glottalised or unreleased in English, while syllable-initial t is regularly
aspirated. Thus we have a natural explanation for the patterning of these allophones in terms of syllable
structure.

Other evidence for the syllable comes from loanwords. When words are borrowed into one language 
from another, they must be adjusted so as to conform to the legal sound patterns (or phonotactics) of the 
host language. For example, consider the following borrowings from English into Dschang, a language of
Cameroon (Bird, 1999).

(18) afruwa 'flower', akalatusi 'eucalyptus', alesa 'razor', abba 'rubber', apleŋge 'blanket', asgkuu 'school', ceen
'chain', dggk 'debt', kapinda 'carpenter', kesiŋ 'kitchen', kuum 'comb', laam 'lamp', lesi 'rice', luum 'room',
mbasgku 'bicycle', mbrusi 'brush', mbgrggk 'brick', meta 'mat', metgrasi 'mattress', ŋglasi 'glass', pjakasi
'jackass', metisi 'match', nubatisi 'rheumatism', pake 'pocket', ŋgale 'garden', sgsa 'scissors', tewele 'towel',
wasi 'watch', ziiŋ 'zinc'.

In Dschang, the syllable canon is much more restricted than in English. Consider the patterning of t.
This segment is illegal in syllable-final position. In technical language, we would say that alveolars are
not licensed in the syllable coda. In meta 'mat', a vowel is inserted, making the t into the initial segment
of the next syllable. For dgak 'debt', the place of articulation of the t is changed to velar, making it a legal
syllable-final consonant. For apleŋge 'blanket', the final t is deleted. Many other adjustments can be seen
in (18), and most of them can be explained with reference to syllable structure.

A third source of evidence for syllable structure comes from morphology. In Ulwa, a Nicaraguan 
language, the position of the possessive infix is sensitive to syllable structure. The Ulwa syllable canon is 
(C)V(V|C)(C), and any intervocalic consonant (i.e. consonant between two vowels) is syllabified with the
following syllable, a universal principle known as onset maximisation. Consider the Ulwa data in (19).



(19)  Word       Possessive     Gloss             Word       Possessive     Gloss

      baa        baa.ka         'excrement'       bi.lam     bi.lam.ka      'fish'
      dii.muih   dii.ka.muih    'snake'           gaad       gaad.ka        'god'
      ii.bin     ii.ka.bin      'heaven'          ii.li.lih  ii.ka.li.lih   'shark'
      kah.ma     kah.ka.ma      'iguana'          ka.pak     ka.pak.ka      'manner'
      lii.ma     lii.ka.ma      'lemon'           mis.tu     mis.ka.tu      'cat'
      on.yan     on.ka.yan      'onion'           pau.mak    pau.ka.mak     'tomato'
      sik.bilh   sik.ka.bilh    'horsefly'        taim       taim.ka        'time'
      tai.tai    tai.ka.tai     'grey squirrel'   uu.mak     uu.ka.mak      'window'
      wai.ku     wai.ka.ku      'moon, month'     wa.sa.la   wa.sa.ka.la    'possum'

Observe that the infix appears at a syllable boundary, and so we can already state that the infix position 
is sensitive to syllable structure. Any analysis of the infix position must take syllable weight into consid- 
eration. Syllables having a single short vowel and no following consonants are defined to be light. (The 
presence of onset consonants is irrelevant to syllable weight.) All other syllables, i.e. those which have 
two vowels, or a single long vowel, or a final consonant, are defined to be heavy; e.g. kah, kaa, muih, bilh,
ii, on. Two common phonological representations for this syllable structure are the onset-rhyme model,
and the moraic model. Representations for the syllables just listed are shown in (20). In these diagrams,
σ denotes a syllable, O onset, R rhyme, N nucleus, C coda and μ mora (the traditional, minimal unit of
syllable weight).
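(20) a. The Onset-Rhyme Model of Syllable Structure

        ka      [σ [O k] [R [N a]]]
        kah     [σ [O k] [R [N a] [C h]]]
        kaa     [σ [O k] [R [N a a]]]
        muih    [σ [O m] [R [N u i] [C h]]]
        bilh    [σ [O b] [R [N i] [C l h]]]
        ii      [σ [R [N i i]]]
        on      [σ [R [N o] [C n]]]

     b. The Moraic Model of Syllable Structure

        ka      [σ k [μ a]]
        kah     [σ k [μ a] [μ h]]
        kaa     [σ k [μ a] [μ a]]
        muih    [σ m [μ u] [μ i h]]
        bilh    [σ b [μ i] [μ l h]]
        ii      [σ [μ i] [μ i]]
        on      [σ [μ o] [μ n]]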



In the onset-rhyme model (20a), consonants coming before the first vowel are linked to the onset node,
and the rest of the material comes under the rhyme node.³ A rhyme contains an obligatory nucleus and
an optional coda. In this model, a syllable is said to be heavy if and only if its rhyme or its nucleus is
branching.

In the moraic model (20b), any consonants that appear before the first vowel are linked directly to
the syllable node. The first vowel is linked to its own mora node (symbolised by μ), and any remaining
material is linked to the second mora node. A syllable is said to be heavy if and only if it has more than
one mora.
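
The moraic definition of weight translates directly into a small function. The sketch below assumes syllables are given in the orthography of (19), that the vowels are the five letters a, e, i, o, u, and that every syllable contains a vowel; the names moras and is_heavy are invented for the example.

```python
VOWELS = set("aeiou")   # assumed vowel letters for the transcription in (19)


def moras(syllable):
    """Count moras: one for the first vowel, one more if anything follows it."""
    first_vowel = next(i for i, ch in enumerate(syllable) if ch in VOWELS)
    remainder = syllable[first_vowel + 1:]
    return 2 if remainder else 1


def is_heavy(syllable):
    """A syllable is heavy if and only if it has more than one mora."""
    return moras(syllable) > 1


for syllable in ["ka", "kah", "kaa", "muih", "bilh", "ii", "on"]:
    print(syllable, moras(syllable), is_heavy(syllable))
```

Onset consonants are skipped by searching for the first vowel, which mirrors the statement that onsets are irrelevant to weight.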

These are just two of several ways that have been proposed for representing syllable structure.
The syllables constituting a word can now be linked to higher levels of structure, such as the foot and the
prosodic word. For now, it is sufficient to know that such higher levels exist, and that we have a way to
represent the binary distinction of syllable weight.

Now we can return to the Ulwa data, from example (19). A relatively standard way to account for the
infix position is to stipulate that the first light syllable, if present, is actually invisible to the rules which
assign syllables to higher levels; such syllables are said to be extra-metrical. They are a sort of 'upbeat'
to the word, and are often associated with the preceding word in continuous speech. Given these general
principles concerning hierarchical structure, we can simply state that the Ulwa possessive affix is infixed
after the first syllable.⁴
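
The infixation statement can be checked against the data in (19) with a short sketch. It assumes words are given with syllable boundaries already marked by dots, that a word-initial light syllable is skipped only when another syllable follows it, and that the vowels are the letters a, e, i, o, u; the function names are invented for the example.

```python
VOWELS = set("aeiou")   # assumed vowel letters for the transcription in (19)


def is_light(syllable):
    """Light: a single short vowel with nothing after it."""
    first_vowel = next(i for i, ch in enumerate(syllable) if ch in VOWELS)
    return syllable[first_vowel + 1:] == ""


def possessive(word, infix="ka"):
    """Insert the infix after the first syllable that is not extra-metrical."""
    syllables = word.split(".")
    # An initial light syllable is treated as invisible (extra-metrical),
    # provided there is a later syllable for the infix to follow.
    skipped = 1 if len(syllables) > 1 and is_light(syllables[0]) else 0
    position = skipped + 1
    return ".".join(syllables[:position] + [infix] + syllables[position:])


for word in ["baa", "bi.lam", "dii.muih", "ka.pak", "wa.sa.la"]:
    print(word, possessive(word))
```

Under these assumptions the function reproduces every possessive form listed in (19), including wa.sa.ka.la, where the infix follows a light second syllable.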

In the foregoing discussion, I hope to have revealed many interesting issues which are confronted by
phonological analysis, without delving too deeply into the abstract theoretical constructs which phonologists
have proposed. Theories differ enormously in their organisation of phonological information and
the ways in which they permit this information to be subjected to rules and constraints, and the way the
information is used in a lexicon and an overarching grammatical framework. Some of these theoretical
frameworks include: lexical phonology, underspecification phonology, government phonology, declarative
phonology, and optimality theory. For more information about these, please see §5.3 for literature
references.

³ Two syllables usually have to agree on the material in their rhyme constituents in order for them to be considered rhyming, hence
the name.

⁴ A better analysis of the Ulwa infixation data involves reference to metrical feet, phonological units above the level of the
syllable. This is beyond the scope of the current chapter, however.



5 Computational phonology

When phonological information is treated as a string of atomic symbols, it is immediately amenable to
processing using existing models. A particularly successful example is the work on finite state transducers
(see chapter 21). However, phonologists abandoned linear representations in the 1970s, and so we will
consider some computational models that have been proposed for multi-linear, hierarchical, phonological
representations. It turns out that these pose some interesting challenges.

Early models of generative phonology, like that of the Sound Pattern of English (SPE), were suffi- 
ciently explicit that they could be implemented directly. A necessary first step in implementing many of 
the more recent theoretical models is to formalise them, and to discover the intended semantics of some 
subtle, graphical notations. A practical approach to this problem has been to try to express phonological 
information using existing, well-understood computational models. The principal models are finite state 
devices and attribute-value matrices. 



5.1 Finite state models of non-linear phonology 

Finite state machines cannot process structured data, only strings, so special methods are required for 
these devices to process complex phonological representations. All approaches involve a many-to-one 
mapping from the parallel layers of representation to a single machine. There are essentially three places 
where this many-to-one mapping can be situated. The first approach is to employ multi-tape machines 
(Kay, 1987). Each tier is represented as a string, and the set of strings is processed simultaneously by a 
single machine. The second approach is to map the multiple layers into a single string, and to process that 
with a conventional single-tape machine (Kornai, 1995). The third approach is to encode each layer itself 
as a finite state machine, and to combine the machines using automaton intersection (Bird and Ellison, 
1994). 
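
To make the third approach more concrete, the following is a minimal sketch of automaton intersection by the product construction. It rests on simplifying assumptions that are not taken from the cited work: each input symbol is a (segment class, tone) pair, one machine constrains the segment classes, and the other enforces a made-up constraint that two high tones may not be adjacent. The names DFA, intersect, syllables and tones are hypothetical.

```python
from itertools import product

class DFA:
    """A deterministic automaton with a partial transition function."""
    def __init__(self, start, finals, delta):
        self.start = start
        self.finals = set(finals)
        self.delta = dict(delta)          # (state, symbol) -> next state

    def accepts(self, symbols):
        state = self.start
        for symbol in symbols:
            if (state, symbol) not in self.delta:
                return False              # no transition: reject
            state = self.delta[(state, symbol)]
        return state in self.finals

def intersect(a, b):
    """Product construction: accepts exactly the strings both machines accept."""
    delta = {}
    for (p, sym_a), p_next in a.delta.items():
        for (q, sym_b), q_next in b.delta.items():
            if sym_a == sym_b:
                delta[((p, q), sym_a)] = (p_next, q_next)
    return DFA((a.start, b.start), set(product(a.finals, b.finals)), delta)

# Illustrative alphabet: a toneless consonant and vowels bearing H or L.
C, VH, VL = ("C", "-"), ("V", "H"), ("V", "L")

# Layer 1 (assumed): words are sequences of CV syllables.
syllables = DFA(0, {0}, {(0, C): 1, (1, VH): 0, (1, VL): 0})

# Layer 2 (assumed): no two H tones in a row; state 1 means "last tone was H".
tones = DFA(0, {0, 1}, {(0, C): 0, (1, C): 1,
                        (0, VH): 1, (0, VL): 0, (1, VL): 0})

both = intersect(syllables, tones)
print(both.accepts([C, VH, C, VL]))   # satisfies both layers
print(both.accepts([C, VH, C, VH]))   # violates the tone constraint
```

The point of the design is that each layer is stated independently and the combined machine is obtained mechanically, so adding a further layer only requires one more call to intersect.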

This work demonstrates how representations can be compiled into a form that can be directly manip- 
ulated by finite state machines. Independently of this, we also need to provide a means for phonological 
generalisations (such as rules and constraints) to be given a finite state interpretation. This problem is well 
studied for the linear case, and compilers exist that will take a rule formatted somewhat like the SPE style 
and produce an equivalent finite state transducer. Whole constellations of ordered rules or optimality- 
theoretic constraints can also be compiled in this way. However, the compilation of rules and constraints 
involving autosegmental structures is still largely unaddressed. 
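
For the linear case, the kind of machine that such a compiler produces can be illustrated by hand. The sketch below is a hand-written transducer, not the output of any actual rule compiler, for a hypothetical rule n → m / _ p that is not drawn from this chapter. Because the rule depends on the following symbol, the machine delays writing an n until it has seen what comes next. The names step, transduce and FINAL_OUTPUT are assumptions of the sketch.

```python
NO_PENDING, PENDING_N = 0, 1   # the two states of the transducer

def step(state, symbol):
    """Return (next state, output string) for one input symbol."""
    if state == NO_PENDING:
        if symbol == "n":
            return PENDING_N, ""          # hold the n until its right context is known
        return NO_PENDING, symbol
    # An n has been read but not yet written.
    if symbol == "p":
        return NO_PENDING, "mp"           # context matches: the rule applies
    if symbol == "n":
        return PENDING_N, "n"             # write the earlier n, hold the new one
    return NO_PENDING, "n" + symbol       # context does not match: n is unchanged

# Output written when the input ends in each state.
FINAL_OUTPUT = {NO_PENDING: "", PENDING_N: "n"}

def transduce(word):
    """Apply the rule to a string of single-character symbols."""
    state, pieces = NO_PENDING, []
    for symbol in word:
        state, piece = step(state, symbol)
        pieces.append(piece)
    pieces.append(FINAL_OUTPUT[state])
    return "".join(pieces)

print(transduce("inpat"))
```

A rule compiler would derive a table like step automatically from the rule notation; cascades of ordered rules then correspond to composing such machines.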

The finite state approaches emphasise the temporal (or left-to-right) ordering of phonological repre- 
sentations. In contrast, attribute-value models emphasise the hierarchical nature of phonological represen- 
tations. 



5.2 Attribute-value matrices 



The success of attribute-value matrices (AVMs) as a convenient formal representation for constraint-based 
approaches to syntax (see chapter 3), and concerns about the formal properties of non-linear phonological 
information, led some researchers to apply AVMs to phonology. Hierarchical structures can be represented 
using AVM nesting, as shown in (21a), and autosegmental diagrams can be encoded using AVM indexes, 
as shown in (21b). 

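(21) a. [AVM: a syllable with the attributes onset and rhyme, the rhyme containing the attributes nucleus and coda] 

     b. [AVM: the attributes syllable and tone, each listing items that carry numbered indexes, and an attribute associations listing pairs of those indexes] 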
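
The two encodings in (21) can be approximated with ordinary nested dictionaries. This is only a sketch: the segments, tones and index numbers below are invented for illustration and are not those of example (21), and the helper names path, tones_of and syllables_of are hypothetical.

```python
# (a) Hierarchical structure as nested attribute-value pairs.
syllable = {
    "onset": ["k"],
    "rhyme": {"nucleus": ["a"], "coda": ["t"]},
}

def path(avm, *attributes):
    """Follow a sequence of attribute names into a nested AVM."""
    for attribute in attributes:
        avm = avm[attribute]
    return avm

# (b) An autosegmental diagram: every item on a tier carries an index,
# and each association line is a pair (syllable index, tone index).
word = {
    "syllable": {1: "ba", 2: "la", 3: "ma"},
    "tone": {4: "H", 5: "L"},
    "associations": [(1, 4), (2, 5), (3, 5)],   # the L tone is linked to two syllables
}

def tones_of(avm, syllable_index):
    """Tones associated with one syllable."""
    return [avm["tone"][t] for s, t in avm["associations"] if s == syllable_index]

def syllables_of(avm, tone_index):
    """Syllables associated with one tone."""
    return [avm["syllable"][s] for s, t in avm["associations"] if t == tone_index]

print(path(syllable, "rhyme", "coda"))
print(syllables_of(word, 5))
```

Keeping the association lines as index pairs, rather than nesting tones inside syllables, is what allows one tone to be linked to several syllables and vice versa.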



AVMs permit re-entrancy by virtue of the numbered indexes, and so parts of a hierarchical structure 
can be shared. For example, (22a) illustrates a consonant shared between two adjacent syllables, for the 
word cousin (this kind of double affiliation is called ambisyllabicity). Example (22b) illustrates shared 
structure within a single syllable, full, to represent the coarticulation of the onset consonant with the 
vowel. 



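(22) a. [AVM for cousin: two syllable structures, each with an onset and a rhyme (nucleus and coda), in which one consonant is shared between the two syllables by means of a numbered index] 

     b. [AVM for full: a single syllable whose onset has consonantal, source (voice -, continuant +) and vocalic (grave +, height close) components; the vocalic value of the nucleus is shared with the onset by means of a numbered index; the coda has consonantal, vocalic and source components] 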

Given such flexible and extensible representations, rules and constraints can manipulate and enrich 
the phonological information. Computational implementations of these AVM models have been used in 
speech synthesis systems. 
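
One common way for constraints to enrich an AVM in constraint-based frameworks is unification, although the text above does not commit to a particular operation. The following is a minimal sketch under that assumption: it handles nested dictionaries with atomic values only, ignores re-entrancy, and uses the hypothetical names unify and UnificationFailure.

```python
class UnificationFailure(Exception):
    """Raised when two descriptions give conflicting values."""

def unify(a, b):
    """Combine the information in two AVMs, or fail if they conflict."""
    if isinstance(a, dict) and isinstance(b, dict):
        result = dict(a)
        for attribute, value in b.items():
            if attribute in result:
                result[attribute] = unify(result[attribute], value)
            else:
                result[attribute] = value      # new information is simply added
        return result
    if a == b:
        return a
    raise UnificationFailure(f"{a!r} conflicts with {b!r}")

# A partially specified segment and an illustrative constraint on it.
coda = {"source": {"nasal": "+"}}
constraint = {"source": {"voice": "+"}}

enriched = unify(coda, constraint)
print(enriched)
```

Because unification only ever adds compatible information, the order in which constraints are applied does not affect the result, which is one reason the operation suits declarative analyses.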



5.3 Computational Tools for Phonological Research 

Once a phonological model is implemented, it ought to be possible to use the implementation to evaluate 
theories against data sets. A phonologist's workbench should help people to 'debug' their analyses and 
spot errors before going to press with an analysis. Developing such tools is much more difficult than it 
might appear. 

First, there is no agreed method for modelling non-linear representations, and each proposal has 
shortcomings. Second, processing data sets presents its own set of problems, having to do with tokenisation, 
symbols which are ambiguous as to their featural decomposition, symbols marked as uncertain or op- 
tional, and so on. Third, some innocuous looking rules and constraints may be surprisingly difficult to 
model, and it might only be possible to approximate the desired behaviour. Additionally, certain universal 
principles and tendencies may be hard to express in a formal manner. A final, pervasive problem is that 
symbolic transcriptions may fail to adequately reflect linguistically significant acoustic differences in the 
speech signal. 
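
The tokenisation problem mentioned above can be illustrated with a small sketch. It splits a transcription into symbols by greedy longest match against a symbol inventory; the inventory is invented for the example and the function name tokenise is hypothetical. Greedy matching is only one possible policy, and it will silently prefer an affricate reading of t followed by s where a cluster may have been intended, which is exactly the kind of ambiguity a workbench has to expose.

```python
def tokenise(transcription, inventory):
    """Split a transcription into symbols by greedy longest match."""
    longest = max(len(symbol) for symbol in inventory)
    tokens, i = [], 0
    while i < len(transcription):
        # Try the widest candidate first, then progressively shorter ones.
        for width in range(min(longest, len(transcription) - i), 0, -1):
            candidate = transcription[i:i + width]
            if candidate in inventory:
                tokens.append(candidate)
                i += width
                break
        else:
            raise ValueError(f"unknown symbol at position {i}: {transcription[i]!r}")
    return tokens

# An illustrative inventory containing two multi-character symbols.
inventory = {"t", "s", "ʃ", "a", "ts", "tʃ"}

print(tokenise("tʃats", inventory))
```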

Nevertheless, whether the phonologist is sorting data, or generating helpful tabulations, or gathering 
statistics, or searching for a (counter-)example, or verifying the transcriptions used in a manuscript, the 
principal challenge remains a computational one. Recently, new directed-graph models (e.g. Emu, MATE, 
Annotation Graphs) appear to provide good solutions to the first two problems, while new advances on 
finite-state models of phonology are addressing the third problem. Therefore, we have grounds for 
confidence that there will be significant advances on these problems in the near future. 
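
The general idea behind directed-graph models can be sketched as a set of time-stamped nodes joined by labelled arcs, with each arc assigned to a tier. The class below is an illustration of that idea only: it is not the API of Emu, MATE or any annotation graph toolkit, it assumes every node has a time, and the time values are invented.

```python
from dataclasses import dataclass, field

@dataclass
class AnnotationGraph:
    times: dict = field(default_factory=dict)   # node id -> time offset in seconds
    arcs: list = field(default_factory=list)    # (start node, end node, tier, label)

    def add_node(self, node, time):
        self.times[node] = time

    def add_arc(self, start, end, tier, label):
        self.arcs.append((start, end, tier, label))

    def tier(self, name):
        """Labels on one tier, ordered by start time."""
        chosen = [arc for arc in self.arcs if arc[2] == name]
        return [arc[3] for arc in sorted(chosen, key=lambda arc: self.times[arc[0]])]

    def within(self, outer_tier, outer_label, inner_tier):
        """Labels on inner_tier whose span lies inside an arc labelled outer_label."""
        found = []
        for start, end, tier, label in self.arcs:
            if tier == outer_tier and label == outer_label:
                lo, hi = self.times[start], self.times[end]
                found += [l for (s, e, t, l) in self.arcs
                          if t == inner_tier
                          and lo <= self.times[s] and self.times[e] <= hi]
        return found

graph = AnnotationGraph()
for node, time in enumerate([0.00, 0.08, 0.20, 0.31]):
    graph.add_node(node, time)
graph.add_arc(0, 1, "phone", "t")
graph.add_arc(1, 2, "phone", "e")
graph.add_arc(2, 3, "phone", "n")
graph.add_arc(0, 3, "word", "ten")

print(graph.within("word", "ten", "phone"))
```

Queries of the within kind are the building block for the searches mentioned above, such as finding every word that contains a given sequence of phones.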






Further reading and relevant resources 



The phonology community is served by an excellent journal Phonology, published by Cambridge University 
Press. Useful textbooks and collections include: (Katamba, 1989; Frost and Katz, 1992; Kenstowicz, 1994; 
Goldsmith, 1995; Clark and Yallop, 1995; Gussenhoven and Jacobs, 1998; Goldsmith, 1999; 
Roca et al., 1999; Jurafsky and Martin, 2000; Harrington and Cassidy, 2000). Oxford University Press 
publishes a series The Phonology of the World's Languages, including monographs on Armenian (Vaux, 
1998), Dutch (Booij, 1995), English (Hammond, 1999), German (Wiese, 1996), Hungarian (Siptar and 
Torkenczy, 2000), Kimatuumbi (Odden, 1996), Norwegian (Kristoffersen, 1996), Portuguese (Mateus 
and d'Andrade, 2000), and Slovak (Rubach, 1993). An important forthcoming survey of phonological 
variation is the Atlas of North American English (Labov et al., 2001). 

Phonology is the oldest discipline in linguistics and has a rich history. Some historically important 
works include: (Joos, 1957; Pike, 1947; Firth, 1948; Bloch, 1948; Hockett, 1955; Chomsky and Halle, 
1968). The most comprehensive history of phonology is (Anderson, 1985). 



Useful resources for phonetics include: (Catford, 1988; Laver, 1994; Ladefoged and Maddieson, 1996; 
Stevens, 1999; International Phonetic Association, 1999; Ladefoged, 2000; Handke, 2001), and the 
homepage of the International Phonetic Association [http://www2.arts.gla.ac.uk/IPA/ipa.html]. 



The phonology/phonetics interface is an area of vigorous research, and the main focus of the Laboratory 
Phonology series published by Cambridge: (Kingston and Beckman, 1991; Docherty and Ladd, 1992; 
Keating, 1994; Connell and Arvaniti, 1995; Broe and Pierrehumbert, 2000). Two interesting essays on the 
relationship between phonetics and phonology are (Pierrehumbert, 1990; Fleming, 2000). Coleman has 
shown that in Tashlhiyt Berber (Morocco), where many words appear to have no vowels, careful phonetic 
analysis dramatically simplifies the phonological analysis of syllable structure (Coleman, 2001). 

Important works on the syllable, stress, intonation and tone include the following: (Pike and Pike, 
1947; Liberman and Prince, 1977; Burzio, 1994; Hayes, 1994; Blevins, 1995; Ladd, 1996; Hirst and Di 
Cristo, 1998; Hyman and Kisseberth, 1998; van der Hulst and Ritter, 1999). Studies of partial specification 
and redundancy include: (Archangeli, 1988; Broe, 1993; Archangeli and Pulleyblank, 1994). 



Attribute-value and directed graph models for phonological representations and constraints are 
described in the following papers and monographs: (Bird and Klein, 1994; Bird, 1995; Coleman, 1998; 
Scobbie, 1998; Bird and Liberman, 2001; Cassidy and Harrington, 2001). 

The last decade has seen two major developments in phonology, both falling outside the scope of this 
limited chapter. On the theoretical side, Alan Prince, Paul Smolensky, John McCarthy and many others 
have developed a model of constraint interaction called Optimality Theory (OT) (Archangeli and Lan- 
gendoen, 1997; Kager, 1999; Tesar and Smolensky, 2000). The Rutgers Optimality Archive houses an 
extensive collection of OT papers [http://ruccs.rutgers.edu/roa.html]. On the computational 
side, the Association for Computational Linguistics (ACL) has a special interest group in computational 
phonology (SIGPHON) with a homepage at [http://www.cogsci.ed.ac.uk/sigphon/]. 
The organization has held five meetings to date, with proceedings published by the ACL and many papers 
available online from the SIGPHON site: (Bird, 1994b; Sproat, 1996; Coleman, 1997; Ellison, 1998; 
Eisner et al., 2000). Another collection of papers was published as a special issue of the journal 
Computational Linguistics in 1994 (Bird, 1994a). Several PhD theses on computational phonology have appeared: 
(Bird, 1995; Kornai, 1995; Tesar, 1995; Carson-Berndsen, 1997; Walther, 1997; Boersma, 1998; 
Wareham, 1999; Kiraz, 2000). Key contributions to computational OT include the proceedings of the fourth 
and fifth SIGPHON meetings, and (Ellison, 1994; Tesar, 1995; Eisner, 1997; Karttunen, 1998). 

The sources of data published in this chapter are as follows: Russian (Kenstowicz and Kisseberth, 
1979); Chakosi (Ghana: Language Data Series, ms); Ulwa (Sproat, 1992:4~). 